This repository provides the official implementation of the STATIC (Sparse Transition-Accelerated Trie Index for Constrained decoding) framework, as described in Su et al., 2026. STATIC is a high-performance method for enforcing outputs to stay within a prespecified set during autoregressive decoding from large language models, designed for maximum efficiency on modern hardware accelerators like GPUs and TPUs.
This article explains what temperature is in the context of language models, how it works, its relationship to beam search, and how output generation can still go haywire despite these techniques.
Deep learning has been deployed in many tasks in NLP, such as translation, image captioning, and dialogue systems. In machine translation, it is used to read source language (input) and generate the desired language (output). Similarly in a dialogue system, it is used to generate a response given a context. This is also known as Natural Language Generation (NLG).